Text-to-Image Generation based on Crossmodal Association with Hierarchical Hypergraphs
Authors
Abstract
In this paper, we propose a novel framework for text-to-image generation based on association between the text and image modalities. As the association model, we use hierarchical hypergraphs consisting of two layers of heterogeneous hypergraphs. The first layer is composed of two hypergraphs, a text hypergraph and an image hypergraph; a hypergraph in the second layer associates the two modalities by merging the two first-layer hypergraphs. The hypergraphs are learned by a self-organizing method based on random sampling. Using the multimodal association represented in the learned model, an intermediate image is generated by cross-modal inference when text keywords are given as a query. We use Korean magazine articles as text-image data for the experiments, and we present the generated intermediate images, together with retrieved images similar to those intermediates, as experimental results.
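The two-layer structure described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not specify hyperedge order, weighting, or the inference rule, so the function names (`sample_hyperedges`, `build_hierarchy`, `infer_image_patches`), the use of string labels in place of real image patches, and the overlap-weighted voting are all hypothetical.

```python
import random
from collections import Counter


def sample_hyperedges(items, order, n_edges, rng):
    """Randomly sample n_edges hyperedges, each a set of up to `order` items."""
    k = min(order, len(items))
    return [frozenset(rng.sample(items, k)) for _ in range(n_edges)]


def build_hierarchy(pairs, order=2, n_edges=3, seed=0):
    """Build a toy two-layer model from (text_words, image_patches) pairs.

    Layer 1: text hyperedges and image hyperedges sampled per document.
    Layer 2: merged (text_edge, image_edge) hyperedges, with a count as
    an assumed weight.
    """
    rng = random.Random(seed)
    layer2 = Counter()
    for words, patches in pairs:
        # Sort the sets so that sampling is reproducible for a given seed.
        text_edges = sample_hyperedges(sorted(words), order, n_edges, rng)
        image_edges = sample_hyperedges(sorted(patches), order, n_edges, rng)
        for text_edge, image_edge in zip(text_edges, image_edges):
            layer2[(text_edge, image_edge)] += 1
    return layer2


def infer_image_patches(layer2, query_words):
    """Score image patches by the merged hyperedges matching the query.

    Assumed rule: each matching hyperedge votes for its image patches,
    weighted by its count times the number of query words it contains.
    """
    query = set(query_words)
    votes = Counter()
    for (text_edge, image_edge), weight in layer2.items():
        overlap = len(text_edge & query)
        if overlap:
            for patch in image_edge:
                votes[patch] += weight * overlap
    return votes
```

Calling `build_hierarchy` on a list of (word set, patch set) pairs and then `infer_image_patches` with a few query keywords returns a `Counter` of patch scores; the highest-scoring patches would play the role of the intermediate image in this simplified picture.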
Similar resources
Improvement of generative adversarial networks for automatic text-to-image generation
This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous studies have used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, given in the form of sentences, to produce and improve the image. The proposed ...
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges of each approach have led researchers to combine them and to use semi-automatic retrieval with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...
Cross-Modal Image Clustering via Canonical Correlation Analysis
A new algorithm via Canonical Correlation Analysis (CCA) is developed in this paper to support more effective crossmodal image clustering for large-scale annotated image collections. It can be treated as a bi-media multimodal mapping problem and modeled as a correlation distribution over multimodal feature representations. It integrates the multimodal feature generation with the Locality Linear...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)
Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions, both of which degrade the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...
Publication date: 2013